Smart Paper Notes

BatmanNet: Bi-branch Masked Graph Transformer Autoencoder for Molecular Representation

Zhen Wang, Zheng Feng, Yanjun Li, Bowen Li, Yongrui Wang, Chulin Sha, Min He, Xiaolin Li

Category: Machine Learning

2022-11-25

Although substantial efforts have been made using graph neural networks (GNNs) for AI-driven drug discovery (AIDD), effective molecular representation learning remains an open challenge, especially when labeled molecules are scarce. Recent studies suggest that large GNN models pre-trained by self-supervised learning on unlabeled datasets enable better transfer performance in downstream molecular property prediction tasks. However, they often require large-scale datasets and considerable computational resources, which makes pre-training time-consuming, computationally expensive, and environmentally unfriendly. To alleviate these limitations, we propose a novel pre-training model for molecular representation learning, Bi-branch Masked Graph Transformer Autoencoder (BatmanNet). BatmanNet features two tailored and complementary graph autoencoders to reconstruct the missing nodes and edges from a masked molecular graph. Surprisingly, we found that a high masking ratio (60%) of the atoms and bonds achieved the best performance. We further propose an asymmetric graph-based encoder-decoder architecture for both nodes and edges, where a transformer-based encoder takes only the visible subset of nodes or edges, and a lightweight decoder reconstructs the original molecule from the latent representation and mask tokens. With this simple yet effective asymmetric design, BatmanNet can learn efficiently even from a much smaller unlabeled molecular dataset to capture the underlying structural and semantic information, overcoming a major limitation of current deep neural networks for molecular representation learning. For instance, using only 250K unlabeled molecules as pre-training data, BatmanNet with 2.575M parameters achieves a 0.5% improvement in average AUC compared with the current state-of-the-art method with 100M parameters pre-trained on 11M molecules.
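
The masking step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `split_visible_masked`, the toy sizes, and the `"[MASK]"` placeholder are assumptions made for illustration only. It shows how a 60% mask ratio splits atoms and bonds into a visible subset (what an encoder would see) and a masked subset (what a decoder would have to reconstruct).

```python
import random


def split_visible_masked(num_items, mask_ratio=0.6, seed=0):
    """Randomly split item indices into (visible, masked) lists.

    Hypothetical helper: the abstract only states that about 60% of
    atoms and bonds are masked, not how the split is implemented.
    """
    rng = random.Random(seed)
    indices = list(range(num_items))
    rng.shuffle(indices)
    num_masked = round(num_items * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked


# Toy molecule with 10 atoms (nodes) and 10 bonds (edges), masked independently.
visible_atoms, masked_atoms = split_visible_masked(10, mask_ratio=0.6, seed=0)
visible_bonds, masked_bonds = split_visible_masked(10, mask_ratio=0.6, seed=1)

# An encoder would receive only the visible items; a decoder would receive
# their encodings plus one mask token per masked position.
visible_set = set(visible_atoms)
decoder_input = [i if i in visible_set else "[MASK]" for i in range(10)]
```

Because the encoder processes only the visible 40% of items, its cost per molecule shrinks accordingly, which is the efficiency argument the abstract makes for the asymmetric design.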

Translated by Google Translate

PP-OCRv3: More Attempts for the Improvement of Ultra Lightweight OCR System

Chenxia Li, Weiwei Liu, Ruoyu Guo, Xiaoting Yin, Kaitao Jiang, Yongkun Du, Yuning Du, Lingfeng Zhu, Baohua Lai, Xiaoguang Hu

Category: Computer Vision

2022-06-07

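Optical character recognition (OCR) technology has been widely used in various scenarios, as shown in Figure 1. Designing a practical OCR system remains a meaningful but challenging task. In previous work, considering both efficiency and accuracy, we proposed a practical ultra-lightweight OCR system (PP-OCR) and an optimized version, PP-OCRv2. To further improve the performance of PP-OCRv2, this paper proposes a more robust OCR system, PP-OCRv3. PP-OCRv3 upgrades the text detection model and the text recognition model of PP-OCRv2 in 9 aspects. For the text detector, we introduce a PAN module with a large receptive field named LK-PAN, an FPN module with a residual attention mechanism named RSE-FPN, and the DML distillation strategy. For the text recognizer, the base model is changed from CRNN to SVTR, and we introduce the lightweight text recognition network SVTR LCNet, attention-guided training of CTC, the data augmentation strategy TextConAug, a better pre-trained model obtained with the self-supervised TextRotNet, UDML, and UIM to accelerate the model and improve its accuracy. Experiments on real data show that the Hmean of PP-OCRv3 is 5% higher than that of PP-OCRv2 at comparable inference speed. All of the above models are open source, and the code is available in the GitHub repository PaddleOCR, which is powered by PaddlePaddle.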

ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation

Shuohuan Wang, Yu Sun, Yang Xiang, Zhihua Wu, Siyu Ding, Weibao Gong, Shikun Feng, Junyuan Shang, Yanbin Zhao, Chao Pang

Category: Natural Language Processing

2021-12-23

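Pre-trained language models have achieved state-of-the-art results in various natural language processing (NLP) tasks. GPT-3 has shown that scaling up pre-trained language models can further exploit their enormous potential. A unified framework named ERNIE 3.0 was recently proposed for pre-training large-scale knowledge-enhanced models, and a model with 10 billion parameters was trained with it. ERNIE 3.0 outperformed the state-of-the-art models on various NLP tasks. To explore the effect of scaling up ERNIE 3.0, we train a hundred-billion-parameter model called ERNIE 3.0 Titan, with up to 260 billion parameters, on the PaddlePaddle platform. Furthermore, we design a self-supervised adversarial loss and a controllable language modeling loss so that ERNIE 3.0 Titan generates credible and controllable text. To reduce computational overhead and carbon emissions, we propose an online distillation framework for ERNIE 3.0 Titan, in which the teacher model teaches the students and trains itself simultaneously. ERNIE 3.0 Titan is the largest Chinese dense pre-trained model to date. Empirical results show that ERNIE 3.0 Titan outperforms the state-of-the-art models on 68 NLP datasets.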

Investigating Glyph Phonetic Information for Chinese Spell Checking: What Works and What's Next

Xiaotian Zhang, Yanjun Zheng, Hang Yan, Xipeng Qiu

Category: Natural Language Processing | Artificial Intelligence

2022-12-08

While pre-trained Chinese language models have demonstrated impressive performance on a wide range of NLP tasks, the Chinese Spell Checking (CSC) task remains a challenge. Previous research has explored using information such as glyphs and phonetics to improve the ability to distinguish misspelled characters, with good results. However, the generalization ability of these models is not well understood: it is unclear whether they incorporate glyph-phonetic information and, if so, whether this information is fully utilized. In this paper, we aim to better understand the role of glyph-phonetic information in the CSC task and suggest directions for improvement. Additionally, we propose a new, more challenging, and practical setting for testing the generalizability of CSC models. All code is made publicly available.

Launchpad: Learning to Schedule Using Offline and Online RL Methods

Vanamala Venkataswamy, Jake Grigsby, Andrew Grimshaw, Yanjun Qi

Category: Machine Learning

2022-12-01

Deep reinforcement learning algorithms have succeeded in several challenging domains. Classic online RL job schedulers can learn efficient scheduling strategies but often take thousands of timesteps to explore the environment and adapt from a randomly initialized DNN policy. Existing RL schedulers overlook the importance of learning from historical data and improving upon custom heuristic policies. Offline reinforcement learning presents the prospect of policy optimization from pre-recorded datasets without online environment interaction. Following the recent success of data-driven learning, we explore two RL methods: 1) Behaviour Cloning and 2) Offline RL, which aim to learn policies from logged data without interacting with the environment. These methods address the challenges concerning the cost of data collection and safety, which are particularly pertinent to real-world applications of RL. Although the data-driven RL methods generate good results, we show that their performance is highly dependent on the quality of the historical datasets. Finally, we demonstrate that by effectively incorporating prior expert demonstrations to pre-train the agent, we short-circuit the random exploration phase and learn a reasonable policy with online training. We utilize Offline RL as a launchpad to learn effective scheduling policies from prior experience collected using Oracle or heuristic policies. Such a framework is effective for pre-training from historical datasets and well suited to continuous improvement with online data collection.
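
The idea of learning a policy purely from logged data can be sketched in a few lines. This is a toy, tabular stand-in for Behaviour Cloning, not the paper's DNN-based scheduler: the state and action names and the function `behaviour_cloning` are hypothetical, and the "policy" simply copies the most frequent logged action for each state.

```python
from collections import Counter, defaultdict


def behaviour_cloning(logged_pairs):
    """Return a state -> action table imitating the logged policy.

    Toy assumption: states and actions are small discrete labels, so
    imitation reduces to picking the most common action per state.
    """
    counts = defaultdict(Counter)
    for state, action in logged_pairs:
        counts[state][action] += 1
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Hypothetical log produced by a heuristic scheduler.
logged = [
    ("queue_short", "run_now"),
    ("queue_short", "run_now"),
    ("queue_short", "defer"),
    ("queue_long", "defer"),
]

policy = behaviour_cloning(logged)
```

A table like this can only be as good as the log it was built from, which mirrors the abstract's observation that performance depends heavily on the quality of the historical dataset; online training would then refine the cloned policy rather than start from random exploration.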

RARE: Renewable Energy Aware Resource Management in Datacenters

Vanamala Venkataswamy, Jake Grigsby, Andrew Grimshaw, Yanjun Qi

Category: Artificial Intelligence

2022-11-10

The exponential growth in demand for digital services drives massive datacenter energy consumption and negative environmental impacts. Promoting sustainable solutions to pressing energy and digital infrastructure challenges is crucial. Several hyperscale cloud providers have announced plans to power their datacenters using renewable energy. However, integrating renewables to power datacenters is challenging because power generation is intermittent, necessitating approaches to tackle power supply variability. Hand-engineering domain-specific, heuristics-based schedulers to meet specific objective functions in such complex, dynamic green datacenter environments is time-consuming and expensive, and it requires extensive tuning by domain experts. Green datacenters need smart systems and system software to employ multiple renewable energy sources (wind and solar) by intelligently adapting computing to renewable energy generation. We present RARE (Renewable energy Aware REsource management), a Deep Reinforcement Learning (DRL) job scheduler that automatically learns effective job scheduling policies while continually adapting to the datacenter's complex, dynamic environment. The resulting DRL scheduler performs better than heuristic scheduling policies across different workloads and adapts to the intermittent power supply from renewables. We identify DRL scheduler system design parameters that, when tuned correctly, produce better performance. Finally, we demonstrate that the DRL scheduler can learn from and improve upon existing heuristic policies using Offline Learning.

DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing

Yanjun Gao, Dmitriy Dligach, Timothy Miller, John Caskey, Brihat Sharma, Matthew M Churpek, Majid Afshar

Category: Natural Language Processing | Artificial Intelligence

2022-09-29

The meaningful use of electronic health records (EHR) continues to progress in the digital era with clinical decision support systems augmented by artificial intelligence. A priority in improving provider experience is to overcome information overload and reduce the cognitive burden so that fewer medical errors and cognitive biases are introduced during patient care. One major type of medical error is diagnostic error due to systematic or predictable errors in judgment that rely on heuristics. The potential for clinical natural language processing (cNLP) to model diagnostic reasoning in humans, with forward reasoning from data to diagnosis, and thereby reduce cognitive burden and medical error has not been investigated. Existing tasks to advance the science in cNLP have largely focused on information extraction and named entity recognition through classification tasks. We introduce a novel suite of tasks called Diagnostic Reasoning Benchmarks, DR.BENCH, as a new benchmark for developing and evaluating cNLP models with clinical diagnostic reasoning ability. The suite includes six tasks from ten publicly available datasets addressing clinical text understanding, medical knowledge reasoning, and diagnosis generation. DR.BENCH is the first clinical suite of tasks designed as a natural language generation framework to evaluate pre-trained language models. Experiments with state-of-the-art pre-trained generative language models, using large general-domain models and models that were continually trained on a medical corpus, demonstrate opportunities for improvement when evaluated in DR.BENCH. We share DR.BENCH as a publicly available GitLab repository with a systematic approach to load and evaluate models for the cNLP community.

Hierarchical Graph Pooling is an Effective Citywide Traffic Condition Prediction Model

Shilin Pu, Liang Chu, Zhuoran Hou, Jincheng Hu, Yanjun Huang, Yuanjian Zhang

Category: Machine Learning

2022-09-08

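Accurate traffic condition prediction provides a solid foundation for vehicle-environment coordination and traffic control tasks. Because of the complexity of the spatial distribution of road network data and the diversity of deep learning methods, it is challenging to define traffic data effectively and to adequately capture the complex spatial nonlinear features in the data. This paper applies two hierarchical graph pooling approaches to the traffic prediction task to reduce graph information redundancy. First, the paper verifies the effectiveness of hierarchical graph pooling methods in traffic prediction tasks by comparing their predictive performance with that of other baselines. Second, two mainstream hierarchical graph pooling methods, node clustering pooling and node drop pooling, are applied to analyze their strengths and weaknesses in traffic prediction. Finally, for the graph neural networks above, the paper compares the effects of different graph network inputs on traffic prediction accuracy, and it analyzes and summarizes effective ways of defining the graph network.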
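
Node drop pooling, one of the two families named in this abstract, can be sketched as follows. This is a generic top-k illustration under stated assumptions, not the specific method evaluated in the paper: the function `node_drop_pool`, the fixed node scores, and the tiny road graph are all hypothetical, and in practice the scores would be learned by the network.

```python
def node_drop_pool(scores, edges, ratio=0.5):
    """Keep the highest-scoring nodes and the edges among them.

    scores: one importance score per node (assumed given here).
    edges:  list of (u, v) index pairs.
    Returns the kept node indices and the re-indexed surviving edges.
    """
    num_keep = max(1, int(len(scores) * ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:num_keep])
    new_index = {old: new for new, old in enumerate(kept)}
    pooled_edges = [
        (new_index[u], new_index[v])
        for u, v in edges
        if u in new_index and v in new_index
    ]
    return kept, pooled_edges


# Toy graph: 4 road segments (nodes) with hypothetical importance scores.
scores = [0.9, 0.1, 0.8, 0.3]
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]

kept, pooled_edges = node_drop_pool(scores, edges, ratio=0.5)
```

Node clustering pooling differs in that it merges nodes into clusters instead of discarding them, so no node information is dropped outright, at the cost of computing a cluster assignment.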

Transfer Learning and Vision Transformer based State-of-Health prediction of Lithium-Ion Batteries

Pengyu Fu, Liang Chu, Zhuoran Hou, Jincheng Hu, Yanjun Huang, Yuanjian Zhang

Category: Computer Vision | Artificial Intelligence

2022-09-07

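In recent years, significant progress has been made in transportation electrification. As the main energy storage devices, lithium-ion batteries (LIB) have received widespread attention. Accurately predicting the state of health (SOH) can not only ease users' anxiety about battery life but also provide important information for battery management. This paper proposes an SOH prediction method based on the Vision Transformer (ViT) model. First, discrete charging data from a predefined voltage range are used as the input data matrix. Then, the cycle features of the battery are captured by the ViT, which can extract global features, and the SOH is obtained by combining the cycle features with a fully connected (FC) layer. In addition, transfer learning (TL) is introduced: a prediction model trained on the source-task battery is further fine-tuned on early-cycle data from the target-task battery to provide accurate predictions. Experiments show that, compared with existing deep learning methods, our method obtains better feature representations and therefore achieves better prediction and transfer performance.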

Summarizing Patients Problems from Hospital Progress Notes Using Pre-trained Sequence-to-Sequence Models

Yanjun Gao, Dmitry Dligach, Timothy Miller, Dongfang Xu, Matthew M. Churpek, Majid Afshar

Category: Natural Language Processing | Artificial Intelligence

2022-08-17

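Automatically summarizing patients' main problems from daily progress notes using natural language processing methods helps to combat information and cognitive overload in hospital settings and may assist providers with computerized diagnostic decision support. Problem list summarization requires a model to understand, abstract, and generate clinical documentation. In this work, we propose a new NLP task that aims to generate a list of problems in a patient's daily care plan using input from the provider's progress notes during hospitalization. We investigate the performance of T5 and BART, two state-of-the-art seq2seq transformer architectures, on this task. We provide a corpus built on publicly available electronic health record progress notes in the Medical Information Mart for Intensive Care (MIMIC)-III. Because T5 and BART are trained on general-domain text, we experiment with a data augmentation method and a domain-adaptive pre-training method to increase exposure to medical vocabulary and knowledge. Evaluation methods include ROUGE, BERTScore, cosine similarity on sentence embeddings, and F-score on medical concepts. Results show that T5 with domain-adaptive pre-training achieves significant performance gains compared with a rule-based system and general-domain pre-trained language models, indicating a promising direction for tackling the problem summarization task.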
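
Two of the evaluation measures listed in this abstract can be written out in a short sketch. It rests on assumptions: the abstract does not say how embeddings or medical concepts are obtained, nor which F-score variant is used, so the sketch takes plain vectors and concept sets as input and computes F1. The function names and example values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def concept_f1(predicted, reference):
    """F1 over sets of concepts (assumed already extracted elsewhere)."""
    predicted, reference = set(predicted), set(reference)
    overlap = len(predicted & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical generated and reference problem lists.
generated = {"sepsis", "hypoxia", "anemia"}
gold = {"sepsis", "hypoxia"}

f1 = concept_f1(generated, gold)  # precision 2/3, recall 1
same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

The concept-level score rewards a summary for naming the right problems regardless of wording, while the embedding similarity gives credit for paraphrases that surface-overlap metrics such as ROUGE can miss.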
